SU@PAN'2016: Author Obfuscation
Authors
Abstract
Protecting the anonymity of a text’s author matters in domains such as witness protection and anonymity programs. Stylometry can reveal the true author of a text even when they wish to hide their identity. In this paper, we present our approach to hiding an author’s identity by masking their style, developed for the Author Obfuscation task of the PAN-2016 competition. The approach consists of three main steps: first, we evaluate different metrics of the text that can indicate authorship; second, we apply various transformations so that those metrics of the target text are adjusted towards the average level, while preserving the meaning and soundness of the text; finally, we add random noise to the text. Our system showed the best performance at masking the author’s style.
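The three steps in the abstract (measure a metric, move it towards an average, add noise) can be illustrated with a minimal Python sketch. This is not the authors' system: the single metric (average sentence length), the target value `TARGET_AVG_SENTENCE_LEN`, the comma-based split and merge rules, and the `NOISE_SYNONYMS` table are all assumptions chosen only for illustration.

```python
import random
import re

# Hypothetical corpus-average target; the abstract gives no concrete values.
TARGET_AVG_SENTENCE_LEN = 20.0  # words per sentence (assumed)

# Hypothetical single-word substitutions used as "random noise".
NOISE_SYNONYMS = {"important": "significant", "show": "demonstrate", "use": "employ"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def avg_sentence_length(text):
    """Step 1: one example metric, the mean number of words per sentence."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


def shorten_sentences(text):
    """Step 2a: lower the metric by splitting sentences at ', and '."""
    out = []
    for sentence in split_sentences(text):
        parts = sentence.split(", and ")
        for i, part in enumerate(parts):
            part = part.strip()
            if not part:
                continue
            if i < len(parts) - 1:
                part += "."  # close every fragment except the last one
            out.append(part[0].upper() + part[1:])
    return " ".join(out)


def lengthen_sentences(text):
    """Step 2b: raise the metric by joining adjacent sentence pairs with ', and '.

    Lowercasing the second sentence's first letter is a crude rule that
    would mangle proper nouns; a real system would need to handle that.
    """
    sentences = split_sentences(text)
    out, i = [], 0
    while i < len(sentences):
        if i + 1 < len(sentences) and sentences[i].endswith("."):
            second = sentences[i + 1]
            out.append(sentences[i][:-1] + ", and " + second[0].lower() + second[1:])
            i += 2
        else:
            out.append(sentences[i])
            i += 1
    return " ".join(out)


def add_noise(text, rate, rng):
    """Step 3: replace listed words with a synonym, each with probability `rate`."""
    words = text.split(" ")
    for i, word in enumerate(words):
        if word in NOISE_SYNONYMS and rng.random() < rate:
            words[i] = NOISE_SYNONYMS[word]
    return " ".join(words)


def obfuscate(text, seed=0):
    """Measure the metric, nudge it towards the assumed target, then add noise."""
    rng = random.Random(seed)
    current = avg_sentence_length(text)
    if current > TARGET_AVG_SENTENCE_LEN:
        text = shorten_sentences(text)
    elif current < TARGET_AVG_SENTENCE_LEN:
        text = lengthen_sentences(text)
    return add_noise(text, 0.5, rng)
```

A single pass like this may overshoot the target; a fuller version would presumably track several metrics and stop once each is close enough to its average.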
Similar resources
Overview of PAN'16 - New Challenges for Authorship Analysis: Cross-Genre Profiling, Clustering, Diarization, and Obfuscation
This paper presents an overview of the PAN/CLEF evaluation lab. During the last decade, PAN has been established as the main forum of digital text forensic research. PAN 2016 comprises three shared tasks: (i) author identification, addressing author clustering and diarization (or intrinsic plagiarism detection); (ii) author profiling, addressing age and gender prediction from a cross-genre persp...
Evaluating Safety, Soundness and Sensibleness of Obfuscation Systems
Author masking is the task of paraphrasing a document so that its writing style no longer matches that of its original author. This task was introduced as part of the 2016 PAN Lab on Digital Text Forensics, for which a total of three research teams submitted their results. This work describes our methodology to evaluate the submitted obfuscation systems based on their safety, soundness and sens...
Author Obfuscation: Attacking the State of the Art in Authorship Verification
We report on the first large-scale evaluation of author obfuscation approaches built to attack authorship verification approaches: the impact of 3 obfuscators on the performance of a total of 44 authorship verification approaches has been measured and analyzed. The best-performing obfuscator successfully impacts the decision-making process of the authorship verifiers on average in about 47% of ...
Overview of the Author Obfuscation Task at PAN 2017: Safety Evaluation Revisited
We report on the second large-scale evaluation of style obfuscation approaches in a shared task on author obfuscation, organized at the PAN 2017 lab on digital text forensics. Author obfuscation means to automatically paraphrase a given text such that state-of-the-art authorship verification approaches misjudge a given pair of documents as having been written by “different authors” if in fact t...
Author Masking using Sequence-to-Sequence Models
The paper describes the approach adopted for the Author Masking Task at PAN 2017. To mask the original author, we use a combination of methods based either on a deep learning approach or on traditional methods of obfuscation. We obtain a sample of obfuscated sentences from the original one and choose the best of them using a language model. We try to change both the content and length of origin...